Corelab Seminar
2012-2013
Paris Koutris (University of Washington)
Query-Based Data Pricing
Abstract.
Data is increasingly being bought and sold online, and Web-based
marketplace services have emerged to facilitate these activities.However,
current mechanisms for pricing data are very simple: buyers can choose
only from a set of explicit views, each with a specific price.
In this paper, we propose a framework for pricing data on the Internet that,
given the price of a few views, allows the price of any query
to be derived automatically. We call this capability ``query-based
pricing.'' We first identify two
important properties that the pricing function must satisfy, called
arbitrage-free and discount-free. Then, we prove that there
exists a unique function that satisfies these properties and
extends the seller's explicit
prices to all queries. When both the views and the query are Unions
of Conjunctive Queries, the complexity of computing the price is
high. To ensure tractability, we restrict the explicit prices to be
defined only on selection views (which is the common practice
today). We give an algorithm with polynomial time data complexity for
computing the price of any chain query by reducing the problem to
network flow. Furthermore, we completely characterize the
class of Conjunctive Queries without self-joins that have PTIME data
complexity, and prove that pricing all other queries is NP-complete,
thus establishing a dichotomy on the complexity of the pricing problem when all views are selection queries.
joint work with Prasang Upadhyaya, Magda Balazinska, Bill Howe and Dan Suciu